Here's how this works: first we import the libraries we need, then we create the ElasticSearch object, and finally we go hunting.
In [1]:
import json
import csv
import time
from pyelasticsearch import ElasticSearch
from es_methods import *
from checkers import *
import requests
Load the settings from the settings.json file: take the template, copy it to settings.json and plug in your own values. There is one wrinkle here. You should not expose your ElasticSearch instance to the world (or even to your corporate network), so you will need a proxy.
The quick and easy option is an SSH tunnel to the ELK box:
ssh -L 9200:127.0.0.1:9200 root@elk_server
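Before running the next cell it can help to confirm that the tunnel is actually up. The following is a minimal sketch, not part of the original notebook: `port_is_open` is a hypothetical helper that only checks whether something accepts TCP connections on the forwarded port. It does not verify that the listener is ElasticSearch.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper: it only proves that something is listening,
    for example the local end of the ssh -L tunnel.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True
```

With the tunnel from the command above running, `port_is_open('127.0.0.1', 9200)` should return True; if it returns False, the ElasticSearch calls in the following cells will fail to connect.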
In [2]:
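# Read the ElasticSearch URL and the VirusTotal key from settings.json
with open('settings.json') as settings_file:
    settings=json.load(settings_file)
es = ElasticSearch(settings['elasticsearch'])
# Select up to 20 documents that have no "evil" verdict yet
query={
    "filter" : {
        "not" : {
            "exists" : { "field" : "evil" }
        }
    }, "size":20
}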
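To make the query's meaning concrete, here is a rough in-memory imitation of what the filter selects: documents without a value for the `evil` field, capped at `size`. This is only an illustration under the assumption that a missing or null field counts as "not exists"; `missing_evil` and the sample documents are hypothetical and do not talk to ElasticSearch.

```python
def missing_evil(docs, size=20):
    """Return up to `size` docs that have no value for the 'evil' field."""
    selected = [doc for doc in docs if doc.get('evil') is None]
    return selected[:size]


# Hypothetical sample documents, not real data
docs = [
    {'hash': 'aaa', 'evil': True},   # already marked evil: skipped
    {'hash': 'bbb', 'evil': False},  # already marked good: skipped
    {'hash': 'ccc'},                 # no verdict yet: selected
]
unchecked = missing_evil(docs)
```

Note that a document marked `evil: False` still has the field, so it is excluded. Once a hash has been given a verdict either way, the same query no longer returns it on later runs.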
Now that the basics are set up, we can go through the executables and check them against VirusTotal. Notice the timer: we need to make sure not to overrun the VirusTotal API rate limit. If you would like some help with the VT API, head over to: https://www.virustotal.com/en/documentation/public-api/
In [3]:
es_res=es.search(query,index='ioc_v2',doc_type='md5')
for res in es_res['hits']['hits']:
    vt_res=check_virustotal(res['_source']['hash'],settings['virustotal'])
    time.sleep(20)  # pause between lookups to stay under the VT rate limit
    if vt_res and vt_res['response_code']==1 and vt_res['positives']>=4:
        # four or more engines flag the file: mark it as evil
        print "ding ding ding!!!", res['_source']
        es.update(index='ioc_v2',doc_type='md5',doc={'evil_score':vt_res['positives']*2,'evil':True},id=res['_id'])
    elif vt_res and vt_res['response_code']==1 and vt_res['positives']==0:
        # known to VirusTotal with no detections: mark it as good
        print "probably good", res['_source']['FullName']
        es.update(index='ioc_v2',doc_type='md5',doc={"evil":False},id=res['_id'])
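The fixed `time.sleep(20)` in the loop above is the simplest way to respect the rate limit. As an alternative, here is a small sketch of a throttle that only sleeps for whatever part of the interval has not already elapsed. `Throttle` is a hypothetical helper, not something from `checkers` or the VirusTotal API, and the interval you pass in is an assumption you should match to your own API key's quota.

```python
import time


class Throttle(object):
    """Ensure at least `min_interval` seconds pass between calls to wait()."""

    def __init__(self, min_interval, clock=time.time, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested
        self._sleep = sleep
        self._last = None      # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

In the loop you would create `throttle = Throttle(20)` once and call `throttle.wait()` right before each `check_virustotal` call, in place of the unconditional sleep. The first lookup then runs immediately, and time spent on the ElasticSearch update counts toward the interval.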
We can check the rest against the Team Cymru database.
In [8]:
es_res=es.search(query,index='ioc_v2',doc_type='md5')
for res in es_res['hits']['hits']:
    if check_cymru(res['_source']['hash']):
        print "probably wicked", res['_source']['hash']
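How `check_cymru` talks to Team Cymru is not shown in this notebook. For orientation only, here is a sketch of the two pure-Python pieces a DNS-based lookup would need, assuming the Malware Hash Registry answers a TXT query for `<md5>.malware.hash.cymru.com` with a string of the form `"<last seen epoch> <detection percent>"`. The zone name, the response format, and both helper names are assumptions to verify against Team Cymru's documentation. The sketch performs no network access.

```python
import re

# Assumption: DNS zone used by the Team Cymru Malware Hash Registry
MHR_ZONE = "malware.hash.cymru.com"


def mhr_hostname(md5_hex):
    """Build the hostname to query for a given MD5 (hypothetical helper)."""
    md5_hex = md5_hex.strip().lower()
    if not re.match(r'^[0-9a-f]{32}$', md5_hex):
        raise ValueError("not an MD5 hex digest: %r" % md5_hex)
    return md5_hex + "." + MHR_ZONE


def parse_mhr_txt(txt):
    """Parse an assumed '<epoch> <percent>' TXT answer into two ints.

    Returns None when the answer does not have that shape.
    """
    parts = txt.strip().strip('"').split()
    if len(parts) != 2:
        return None
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None
```

A DNS library or a `dig` call would sit between the two functions. A hash that the registry does not know would yield no TXT answer at all.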
If you have way too much time and computing resources on your hands, you can always md5 an entire known-good image and record the results in ElasticSearch
OR
you can download the National Software Reference Library (http://www.nsrl.nist.gov/Downloads.htm) and upload it into ElasticSearch as a list of known good hashes. Important caveat: the file is about 20GB in size (155,213,072 hashes), so you'll need a better ELK stack than the VM.
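Both options boil down to producing a stream of known-good MD5s. The sketch below shows one way to generate them. It makes several assumptions: the NSRL hash file is a CSV with a header row containing `MD5` and `FileName` columns, and the output field names `hash` and `FullName` simply mirror the ones used earlier in this notebook. All three helpers are hypothetical, and uploading the resulting documents to ElasticSearch is left out.

```python
import csv
import hashlib
import os


def md5_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def md5_tree(root):
    """Yield (path, md5) for every regular file under `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                yield path, md5_file(path)
            except (IOError, OSError):
                continue  # unreadable file: skip it


def nsrl_docs(lines):
    """Turn NSRL CSV lines (header first) into known-good documents."""
    for row in csv.DictReader(lines):
        yield {'hash': row['MD5'].lower(), 'FullName': row['FileName']}
```

For the first option, point `md5_tree` at the root of a mounted clean image. For the second, pass the open NSRL text file to `nsrl_docs`. With a file this large, sending the documents in batches rather than one at a time is likely to matter.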